Effective Variable-Length-to-Fixed-Length Coding via a Re-Pair Algorithm

Authors

  • Satoshi Yoshida
  • Takuya Kida
Abstract

We address the problem of improving variable-length-to-fixed-length codes (VF codes). A VF code is an encoding scheme that uses a fixed-length code, and thus, one can easily access the compressed data. However, conventional VF codes usually have an inferior compression ratio to that of variable-length codes. Although a method proposed by T. Uemura et al. in 2010 achieves a good compression ratio comparable to that of gzip, it is very time consuming. In this study, we propose a new VF coding method that applies a fixed-length code to the set of rules extracted by the Re-Pair algorithm, proposed by N. J. Larsson and A. Moffat in 1999. The Re-Pair algorithm is a simple off-line grammar-based compression method that has good compression-ratio performance with moderate compression speed. Moreover, we present several experimental results to show that the proposed coding is superior to the existing VF coding.
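As a rough illustration of the idea described in the abstract, the following Python sketch runs a naive Re-Pair pass (repeatedly replacing the most frequent adjacent pair with a new symbol) and then assigns every symbol of the final sequence a codeword of the same width. It is not the authors' implementation: the function names `repair`, `decode`, and `fixed_length_codes`, the `max_rules` cap, and the way codeword indices are assigned are all assumptions made for this example, and the quadratic pair counting is far slower than the original Re-Pair data structures.

```python
from collections import Counter


def repair(text, max_rules=None):
    """Naive Re-Pair sketch: replace the most frequent adjacent pair
    with a fresh nonterminal until no pair occurs at least twice
    (or until the hypothetical max_rules cap is reached)."""
    seq = list(text)
    rules = {}  # nonterminal -> (left symbol, right symbol)
    while max_rules is None or len(rules) < max_rules:
        # Counts overlapping occurrences too, which is fine for a sketch.
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        sym = ("R", len(rules))  # tuples mark nonterminals
        rules[sym] = pair
        out, i = [], 0
        while i < len(seq):
            # Greedy left-to-right, non-overlapping replacement.
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Recursively expand a symbol back into the text it derives."""
    if symbol not in rules:
        return symbol  # terminal character
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


def decode(seq, rules):
    return "".join(expand(s, rules) for s in seq)


def fixed_length_codes(seq, rules, alphabet):
    """Give every terminal and nonterminal an index of equal bit width,
    so each codeword stands for a variable-length substring."""
    symbols = sorted(alphabet) + list(rules)
    index = {s: i for i, s in enumerate(symbols)}
    bits = max(1, (len(symbols) - 1).bit_length())
    return [index[s] for s in seq], bits
```

Capping the number of rules (here via the assumed `max_rules` parameter) is one simple way to keep the dictionary small enough for a chosen codeword width; because every codeword has the same number of bits, the position of the k-th codeword in the compressed stream can be computed directly.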


Similar resources

Estimating the Location of Protein-Coding Regions in Numerical DNA Sequences Using a Variable-Length Window Based on the Three-Dimensional Z-Curve

In recent years, estimation of protein-coding regions in numerical deoxyribonucleic acid (DNA) sequences using signal processing tools has been a challenging issue in bioinformatics, owing to their 3-base periodicity. Several digital signal processing (DSP) tools have been applied in order to identify the task and concentrated on assigning numerical values to the symbolic DNA sequence, then app...


On Coding Navigation Paths for In-Memory Navigation in Persistent Object Stores

We consider matrix index and navigation index approaches to in-memory navigation of persistent object stores. We demonstrate that both approaches can be re-formulated independently from the used coding technique. We expose the limitations of fixed length coding and the inefficiency of simple continued fractions as a variable length coding technique. The conclusion is that alternative variable l...


LZAC Lossless Data Compression

This paper presents LZAC, a new universal lossless data compression algorithm derived from the popular and widely used LZ77 family. The objective of LZAC is to improve the compression ratios of the LZ77 family while still retaining the family’s key characteristics: simple, universal, fast in decoding, and economical in memory consumption. LZAC presents two new ideas: composite fixed-variable-le...


Variable Length Lossless Coding for Variational Distance Class: An Optimal Merging Algorithm

In this paper we consider lossless source coding for a class of sources specified by the total variational distance ball centred at a fixed nominal probability distribution. The objective is to find a minimax average length source code, where the minimizers are the codeword lengths – real numbers for arithmetic or Shannon codes – while the maximizers are the source distributions from the total ...


A Lossless re-Encoding of MPEG-2 Coded file by Integrating Four Motion Vectors

Re-encoding of once compressed files is one of the difficult challenges in measuring the efficiency of coding methods. Variable length coding with a variable source delimiting scheme is a promising method for improving re-encoding efficiency. Analyses of coded files with fixed length delimiting and with variable length delimiting are reviewed. Motion vector codes of MPEG-2 encoded files are mod...




Publication date: 2012